Background / Context: 

Despite the theoretical effectiveness of stratification methods such as those based on the 
propensity score (PS), the opacity and uniqueness of most enacted treatment selection 
mechanisms in nonexperimental social science research make it difficult to know a priori the 
appropriate covariates on which to stratify. Yet, because the plausibility of strong ignorability of 
the treatment assignment and corresponding inferences are highly dependent upon the selected 
covariates, a central concern in pretreatment stratification methods is how to choose which covariates to 
stratify on (e.g. Cook, Steiner & Pohl, 2009). In particular, stratifications that are too coarse (e.g. 
too few relevant covariates) likely give rise to biased treatment estimates (e.g. Smith & Todd, 
2005). Conversely, the inclusion of bias amplifying or extraneous covariates has also been 
shown to import extra bias and degrade the efficiency of the treatment effect estimator (e.g. 
Brookhart et al., 2006; Pearl, 2010; Wooldridge, 2009). For instance, stratifying on instrumental 
variables and covariates with colliding paths potentially increases bias and variance over 
unstratified estimates. Of relevance to this study is collider bias originating from stratification on 
pretreatment variables thought to form an M structural design (Figure 1). Given a treatment, Z, 
an observed pretreatment covariate, X, two unobserved and independent pretreatment covariates, 
U1 and U2, and an outcome, Y, the covariate X is a collider and may amplify bias when assessing 
the effect of the treatment on the outcome. That is, when two variables (e.g. U1 and U2) share a 
common effect (e.g. X), stratification on that common effect (e.g. X) induces a statistical 
relation between otherwise independent factors (e.g. U1 and U2) (Figure 2). In turn, because these 
unobserved independent covariates are also causes of the treatment and outcome, stratifying on 
only X further induces a spurious relation between the treatment and outcome beyond the true 
treatment effect (i.e. collider bias). However, because the observed covariate, X, is hypothesized 
to be a confounder (e.g. Figure 3), concerns about confounding bias frequently dominate the 
potential for collider bias from unobserved bias amplifying covariates (e.g. Greenland, 2003). As 
a result, modal advice has been to stratify along a rich combination of observed covariates (e.g. 
Rubin & Thomas, 1996; Stuart & Rubin, 2007; Stuart, 2010). Yet, recent empirical 
investigations have demonstrated sizeable bias potentially corresponding to such collider bias 
especially with saturated stratifications (Whitcomb, Schisterman, Perkins & Platt, 2009; Steiner, 
Cook, Shadish & Clark, 2010). Such applications indicate the complexity of applying principles 
and suggest that there is much more to bias reduction than simply stratifying on many covariates. 

Purpose / Objective / Research Question / Focus of Study: 

Of particular import to this study is collider bias originating from stratification on 
pretreatment variables forming an embedded M or bowtie structural design (Figure 4). That is, 
rather than assume an M structural design, which suggests that X is a collider but not a 
confounder, we adopt what we consider to be a more reasonable position: that X is both a 
collider and a confounder. Accordingly, in this study we examined the extent to which 
confounder-induced bias exceeds collider-induced bias. To inform this tradeoff, we quantified the bias from 
two simple linear model estimators which are asymptotically equivalent to stratification and 
matching on these variables (alone or with the propensity score) (e.g. Pearl, 2009). More 
specifically, we examined this tradeoff by quantifying the net bias induced from adjusting for X 
versus the net bias from ignoring it. Stratifying on X removes confounding bias but 
induces collider bias, whereas ignoring X avoids collider bias but incurs confounding bias. 
For that reason, this study quantified the threshold at which the collider and confounder bias due to 
a hypothesized confounder (e.g. X) are equal. The intention is to provide pragmatic guidance as to 
the consequences of, and the decision to, stratify on covariates hypothesized to be confounders 
and/or colliders. 
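
As a rough illustration of this tradeoff, the following Python sketch simulates a linear version of the structure in which X is caused by two unobserved variables and, optionally, also affects Z and Y. The unit path coefficients, the standard normal errors, the continuous treatment, and the names `simulate`, `x_to_z`, and `x_to_y` are all assumptions made for illustration and are not taken from the study. Setting both X paths to zero gives the plain M structure; nonzero values give the embedded M (bowtie) structure.

```python
import random


def simulate(delta=0.5, x_to_z=1.0, x_to_y=1.0, n=20000, seed=0):
    """Simulate an (embedded) M structure; return (adjusted, unadjusted) slopes on Z.

    Hypothetical data-generating model (illustrative assumption only):
      X = U1 + U2 + e,  Z = U1 + x_to_z*X + e,  Y = delta*Z + U2 + x_to_y*X + e
    """
    rng = random.Random(seed)
    xs, zs, ys = [], [], []
    for _ in range(n):
        u1, u2 = rng.gauss(0, 1), rng.gauss(0, 1)  # unobserved, independent causes
        x = u1 + u2 + rng.gauss(0, 1)              # X is a common effect of U1 and U2
        z = u1 + x_to_z * x + rng.gauss(0, 1)      # treatment (continuous in this sketch)
        y = delta * z + u2 + x_to_y * x + rng.gauss(0, 1)
        xs.append(x)
        zs.append(z)
        ys.append(y)

    def cov(a, b):
        mean_a, mean_b = sum(a) / n, sum(b) / n
        return sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b)) / n

    s_zz, s_xx, s_xz = cov(zs, zs), cov(xs, xs), cov(xs, zs)
    s_zy, s_xy = cov(zs, ys), cov(xs, ys)

    # Least squares slope on Z when Y is regressed on Z alone.
    unadjusted = s_zy / s_zz
    # Least squares slope on Z when Y is regressed on Z and X.
    adjusted = (s_zy * s_xx - s_xy * s_xz) / (s_zz * s_xx - s_xz ** 2)
    return adjusted, unadjusted


if __name__ == "__main__":
    print("embedded M:", simulate(x_to_z=1.0, x_to_y=1.0))
    print("plain M:   ", simulate(x_to_z=0.0, x_to_y=0.0))
```

Under these assumed coefficients the large-sample adjusted slope works out to the true effect minus 0.2 in both settings (collider bias), while the unadjusted slope equals the true effect in the plain M structure and the true effect plus 5/7 in the embedded M structure (confounding bias). In this particular configuration, then, adjusting for X leaves the smaller bias; other coefficient choices can change that ordering.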

Setting: 

(May not be applicable for Methods submissions) 

Population / Participants / Subjects: 

(May not be applicable for Methods submissions) 

Intervention / Program / Practice: 

(May not be applicable for Methods submissions) 

Significance / Novelty of study: 

To a large extent there are competing theories and evidence as to which set of variables 
one should stratify on to approximate the strong ignorability of treatment assignment (e.g. Rubin, 
2009; 2001; Pearl, 2010; Steiner et al., 2010). One (experimentalist) perspective has primarily 
suggested stratifying along a rich set of variables to produce covariate balance across treatment 
groups on all observed variables. In opposition, other (structuralist) perspectives are particularly 
concerned with the fundamental structure of germane (un)observed variables (e.g. Pearl, 2010). 
The surrounding empirical literature has demonstrated support for both sides (e.g. Steiner et al., 
2010; Rubin, 2001). In this study, we explicate the conditions under which 
confounding bias dominates collider bias. In particular, this study develops an approach to 
quantify the conditions under which the net bias (confounding plus collider) is reduced through 
stratification on a confounder/collider. 

Statistical, Measurement, or Econometric Model: 

In assessing the unique relationship between Z and Y given in the directed acyclic graph 
in Figure 4, we may choose to stratify on X or not. As X is both a collider and confounder, either 
approach will address one form of bias but induce another. To assess this exchange, we 
can quantify the change in bias by identifying the threshold at which the potential collider 
bias introduced by including X exceeds the confounding bias induced by omitting X. 
That is, we might construct and stratify on a propensity score with or without variable X or, by 
asymptotic equivalence (e.g. Pearl, 2009), we might consider the equations 

$Y_i = \beta_0 + \beta_1 X_i + \delta_s Z_i + \varepsilon_i$ (1.1) 

$Y_i = \beta_0 + \delta_u Z_i + \varepsilon_i$ (1.2) 

where the subscripts s and u mark the coefficient on Z with and without stratification on X, and $\delta$ denotes the true treatment effect. 
Given the variable relationships in Figure 4, the estimator $\hat{\delta}_s$ in equation (1.1) addresses the 
confounding bias brought about by X, but induces collider stratification bias as a result of the 
conditional relationships with the unobserved variables. In contrast, the estimator $\hat{\delta}_u$ in equation 
(1.2) neglects the confounding bias but circumvents the collider bias. Because in practice we 
have not measured the unobserved variables, we cannot stratify on the unobserved variables and 
X to address both confounding and collider bias. However, because the introduction of collider 
bias is limited by the observed relationships of X with Z and Y, we can assess the change in net 
bias. Upon stratifying on X, confounding bias has been eliminated and the remaining bias is that 
of the collider 

$\mathrm{Bias}(\hat{\delta}_s) = \hat{\delta}_s - \delta$ (1.3) 

Similarly, bias from the unstratified estimator is a result of confounding bias only 

$\mathrm{Bias}(\hat{\delta}_u) = \hat{\delta}_u - \delta$ (1.4) 

The difference between collider and confounding bias between the estimators is 

$|\mathrm{Bias}(\hat{\delta}_s)| - |\mathrm{Bias}(\hat{\delta}_u)| = |E(\hat{\delta}_s) - \delta| - |E(\hat{\delta}_u) - \delta|$ (1.5) 

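This difference (1.5) can take on the following four situations, writing $\hat{\delta}_s$ and $\hat{\delta}_u$ for the estimators from (1.1) and (1.2) and $\delta$ for the true treatment effect 

(1) $\hat{\delta}_s > \delta$, $\hat{\delta}_u > \delta$: $[E(\hat{\delta}_s) - \delta] - [E(\hat{\delta}_u) - \delta] = E(\hat{\delta}_s) - E(\hat{\delta}_u)$ 
(2) $\hat{\delta}_s < \delta$, $\hat{\delta}_u > \delta$: $[\delta - E(\hat{\delta}_s)] - [E(\hat{\delta}_u) - \delta] = 2\delta - E(\hat{\delta}_s) - E(\hat{\delta}_u)$ 
(3) $\hat{\delta}_s > \delta$, $\hat{\delta}_u < \delta$: $[E(\hat{\delta}_s) - \delta] - [\delta - E(\hat{\delta}_u)] = E(\hat{\delta}_s) + E(\hat{\delta}_u) - 2\delta$ 
(4) $\hat{\delta}_s < \delta$, $\hat{\delta}_u < \delta$: $[\delta - E(\hat{\delta}_s)] - [\delta - E(\hat{\delta}_u)] = E(\hat{\delta}_u) - E(\hat{\delta}_s)$ (1.6) 

For brevity we focus on the most common situation where there is a positive treatment effect and 
the confounding variables are positively correlated with the treatment and outcome such that 
(1.1) underestimates and (1.2) overestimates the treatment effect as summarized in (2) in (1.6). 
Rewriting (2) in (1.6) as the least squares estimators using correlation coefficients 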

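$\text{Change in Bias} = 2\delta - \frac{\rho_{YZ} - \rho_{YX}\rho_{XZ}}{1 - \rho_{XZ}^{2}} \frac{\sigma_Y}{\sigma_Z} - \rho_{YZ} \frac{\sigma_Y}{\sigma_Z}$ (1.7) 

where $\rho$ and $\sigma$ indicate the appropriate correlation and standard deviation. This equation 
expresses the change in net bias from both colliding and confounding. Setting the bias terms 
equal to each other, we can obtain a threshold at which colliding and confounding bias are 
equal: 

$\left[\frac{\sigma_Y}{\sigma_Z}\rho_{YZ} - \delta\right] = \left[\delta - \frac{\sigma_Y}{\sigma_Z} \frac{\rho_{YZ} - \rho_{YX}\rho_{XZ}}{1 - \rho_{XZ}^{2}}\right]$ (1.8) 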



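Equation (1.8) depicts when one form of bias dominates the other. For instance, when 

$\left[\frac{\sigma_Y}{\sigma_Z}\rho_{YZ} - \delta\right] > \left[\delta - \frac{\sigma_Y}{\sigma_Z} \frac{\rho_{YZ} - \rho_{YX}\rho_{XZ}}{1 - \rho_{XZ}^{2}}\right]$ (1.9) 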



the net bias from confounding will exceed the net bias from colliding and we should stratify on 
X. Similar derivations can establish a threshold with respect to evaluative measures which further 
incorporate the variability of the estimator such as the mean-squared error. More specifically, 

$\mathrm{MSE}(\hat{\delta}) = (\mathrm{Bias}(\hat{\delta}))^{2} + \mathrm{Var}(\hat{\delta})$ (1.10) 



Rewriting (1.10) for both estimators using correlations gives the change in MSE, 
for which similar comparisons and thresholds can be made. 
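
The quantities in (1.5), (1.7), and (1.8) can be evaluated directly from three observed correlations. The Python sketch below is one possible way to do so; the function names (`estimators`, `change_in_bias`, `equal_bias_effect`) and the input values in the usage lines are hypothetical and chosen only for illustration. It relies on the standard large-sample expressions for the least squares slope on Z with and without X in the model.

```python
def estimators(rho_yz, rho_yx, rho_xz, sd_y=1.0, sd_z=1.0):
    """Large-sample least squares slopes on Z: (adjusted for X, unadjusted)."""
    scale = sd_y / sd_z
    adjusted = scale * (rho_yz - rho_yx * rho_xz) / (1.0 - rho_xz ** 2)
    unadjusted = scale * rho_yz
    return adjusted, unadjusted


def change_in_bias(delta, rho_yz, rho_yx, rho_xz, sd_y=1.0, sd_z=1.0):
    """|Bias(adjusted)| - |Bias(unadjusted)| as in (1.5).

    Negative values favor adjusting for X. In situation (2), where the adjusted
    estimator underestimates and the unadjusted one overestimates, this equals (1.7).
    """
    adjusted, unadjusted = estimators(rho_yz, rho_yx, rho_xz, sd_y, sd_z)
    return abs(adjusted - delta) - abs(unadjusted - delta)


def equal_bias_effect(rho_yz, rho_yx, rho_xz, sd_y=1.0, sd_z=1.0):
    """Effect size at which the two biases are equal, solving (1.8) for delta."""
    adjusted, unadjusted = estimators(rho_yz, rho_yx, rho_xz, sd_y, sd_z)
    return (adjusted + unadjusted) / 2.0


if __name__ == "__main__":
    # Hypothetical correlations and effect size, not values from the study.
    print(estimators(0.5, 0.4, 0.3))
    print(change_in_bias(0.45, 0.5, 0.4, 0.3))
    print(equal_bias_effect(0.5, 0.4, 0.3))
```

Read this way, a true effect below the value returned by `equal_bias_effect` (and above the adjusted slope) corresponds to confounding bias dominating, which favors stratifying on X; a true effect above it corresponds to collider bias dominating.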



Research Design: 

(May not be applicable for Methods submissions) 



Data Collection and Analysis: 

(May not be applicable for Methods submissions) 



Findings / Results: 

Figure 5 graphically displays the change in bias ($|\mathrm{Bias}(\hat{\delta}_s)| - |\mathrm{Bias}(\hat{\delta}_u)|$, X-stratified minus unstratified estimator) as a function 
of the treatment-confounder (Z-X) correlation for several outcome-confounder (Y-X) correlations 
using a medium treatment effect size of 0.5 and standardized variables. More specifically, 
negative 'Change in Bias' values indicate situations where it is better to stratify on X because 
estimators based on this stratification tend to offer a less biased estimate than those which 
exclude X. As is evident from the example depicted in Figure 5, it is often more important to address 
confounding bias by stratifying on X than collider bias by excluding X. The exception comes 
when the outcome-confounder (Y-X) correlation exceeds that of the outcome-treatment (Y-Z) and the 
treatment-confounder (Z-X) correlation is very high (>0.90). Similar plots for different effect sizes 
indicated that the outcome-confounder correlation tends to need to be similar to or greater than the 
outcome-treatment correlation for collider bias to be of practical concern in structural systems 
that contain embedded M-relationships. 
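
A curve of this kind can be tabulated by sweeping the treatment-confounder correlation over a grid. The Python sketch below assumes standardized variables and uses a hypothetical outcome-treatment correlation of 0.6, which is not a value reported in the text, so it is not expected to reproduce Figure 5. The helper names `is_valid` and `sweep` are likewise illustrative. Combinations of correlations that cannot occur together (a correlation matrix that is not positive definite) are skipped.

```python
def change_in_bias(delta, rho_yz, rho_yx, rho_xz):
    """|Bias(adjusted)| - |Bias(unadjusted)| for standardized variables."""
    adjusted = (rho_yz - rho_yx * rho_xz) / (1.0 - rho_xz ** 2)
    return abs(adjusted - delta) - abs(rho_yz - delta)


def is_valid(rho_yz, rho_yx, rho_xz):
    """True when the three correlations form a positive-definite 3x3 correlation matrix."""
    det = (1.0 - rho_yz ** 2 - rho_yx ** 2 - rho_xz ** 2
           + 2.0 * rho_yz * rho_yx * rho_xz)
    return det > 0.0


def sweep(delta, rho_yz, rho_yx_values, steps=20):
    """Tabulate (rho_yx, rho_xz, change in bias) over rho_xz = 0, 1/steps, ..., (steps-1)/steps."""
    rows = []
    for rho_yx in rho_yx_values:
        for k in range(steps):
            rho_xz = k / steps
            if is_valid(rho_yz, rho_yx, rho_xz):
                rows.append((rho_yx, rho_xz, change_in_bias(delta, rho_yz, rho_yx, rho_xz)))
    return rows


if __name__ == "__main__":
    # delta = 0.5 follows the effect size named in the text; rho_yz = 0.6 is an assumption.
    for rho_yx, rho_xz, change in sweep(0.5, 0.6, [0.2, 0.4, 0.6]):
        print(f"rho_yx={rho_yx:.1f}  rho_xz={rho_xz:.2f}  change={change:+.4f}")
```

Negative entries in the last column mark grid points where the X-adjusted estimator carries less bias than the unadjusted one; positive entries mark the reverse.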

Usefulness / Applicability of Method: 

To ground the relevance of this approach, we discuss a simplified application assessing 
the effect of teacher instructional practice in reading on student reading achievement while 
adjusting for teachers' reading knowledge. We very briefly frame the study and describe the 
potential for both confounder and collider bias. With renewed emphasis on observation of 
enacted classroom process (e.g. teaching) as a central feature of research designs, there has been 
substantial development of a diverse set of standardized classroom observation systems 
focusing on direct assessments (e.g., Cameron, Connor, & Morrison, 2005). The expectation is 
that such systems will help uncover reliable evidence concerning the processes which drive 
teachers' contribution to students' growth. To address variation across and within teachers in the 
nature of their instruction, the Assessment of Pedagogical Knowledge of Teachers of Reading study 
(APK) sought to investigate what instructional quality is and the extent to which it actually 
matters. In particular, the study focused on measuring instruction in first through third grade 
teachers from urban school districts in Michigan using several observation methods. Further, the 
study centered on identifying and summarizing those instructional practices that are associated 
with students' gains in reading over the course of a year in early literacy instruction. To capture 
the content, style, and delivery of each lesson, the study developed an observation system which 
had trained observers record observable instructional actions (IAs) within individual lessons. IAs 
included lesson features like giving directions, assessing student work, providing opportunities 
for students to participate or teacher soliciting student participation. Further, because teachers 
had undergone extensive professional development, the study assessed teachers' reading 
knowledge with emphasis on the knowledge about reading teachers draw on to teach early 
reading. 

The part we focus on is the association of teachers' instructional reading practice with 
students' reading achievement when adjusting for teachers' reading knowledge. In particular, 
because we suspected that teachers' knowledge impacts students' achievement both through 
instructional reading practice and through other classroom instructional domains, teachers' 
reading knowledge may represent an important confounding variable. However, teachers' reading 
knowledge may also represent a colliding variable. For example, teachers' reading knowledge 
and instructional reading practice may have a common cause such as repeated exposure to 
professional development in reading. Similarly, teachers' reading knowledge and students' 
reading achievement may also share a common source such as teachers' general knowledge or 
ability to communicate effectively. Because it is suspected that teachers' reading knowledge 
informs instructional practice in reading as well as students' reading outcomes through other 
channels, there is a potential for confounding bias if teachers' reading knowledge is omitted. 
Similarly, because we measured teachers' reading knowledge but did not appropriately measure 
professional development and general knowledge, there is a potential for collider bias when 
controlling for teachers' reading knowledge. This hypothesized relationship is depicted in Figure 
6 such that it forms an embedded M or bowtie structure. To assess the potential tradeoff between 
confounder and collider bias, we can apply the above derivations to the empirical data. The 
correlation between the outcome, the Iowa Test of Basic Skills-Reading Comprehension, and the 
measure of instructional reading practice was approximately 0.2. Similarly, the correlation 
between the outcome and our measure of teachers' reading knowledge was about 0.1, whereas 
teachers' reading knowledge was correlated with practice at 0.4. Applying the above thresholds, 
the observed data strongly suggest we should adjust for teachers' reading knowledge, 
as the potential reduction in confounding bias from its adjustment likely exceeds the potential collider bias 
introduced by adjusting for it (Figure 7). More specifically, in the absence of the true treatment 
effect we graphed the change in bias curve for effect sizes of 0.1, 0.2 and 0.4. Our results 
indicated that the treatment effect size would have to exceed 0.45 in order for collider bias to be 
of more concern than confounder bias. Given a zero-order correlation between the treatment and 
outcome of 0.20, it seems highly unlikely that the effect size would be of such magnitude. 

Conclusions: 

There are clear potential benefits of empirically appraising the collider-confounder bias 
exchange. At a minimum, it helps to understand and bound the extent to which covariates may 
serve to amplify or reduce bias and, in turn, mount a more informed evidentiary basis. Such 
appraisals also serve to shift issues surrounding stratification on colliders from a theoretical 
exercise to an empirical one. For instance, in the given example, rather than (not) stratify on 
teachers' reading knowledge solely on the theoretical basis that it is (not) a collider, such 
analyses allow empirical assessment of the variable's impact if it were a collider. 

Appendix B, Tables and Figures 

Figure 1: Variables forming an M structural relationship with a treatment, Z, an observed 
pretreatment covariate, X, two unobserved and independent pretreatment covariates, U1 and U2, 
and an outcome, Y. The pretreatment covariate X is a collider and may amplify bias when 
assessing the effect of the treatment on the outcome. 




Figure 2: Stratification on X in Figure 1 produces a spurious relation between Z and Y beyond 
their true relation since U1 and U2 both affect X. 




Figure 3: X as a confounder of the relationship between Z and Y. 





Figure 4: Embedded M or bowtie structural relations. Here X is both a confounder and collider. 

Figure 5: Change in bias ($|\mathrm{Bias}(\hat{\delta}_s)| - |\mathrm{Bias}(\hat{\delta}_u)|$, X-stratified minus unstratified estimator) as a function of the treatment-confounder (Z-X) 
correlation for several outcome-confounder (Y-X) correlations using a 0.5 effect size. When 
'Change in Bias' is less than zero, bias is reduced by stratifying on X. (Horizontal axis: rho_xz, from 0.0 to 1.0.) 



Figure 6: Structural relations among variables given in the practical example. We would like to 
assess the association of practice, Prac, with students' reading comprehension achievement, RC, 
where teachers' reading knowledge, TK, is both a confounder and collider as professional 
development, PD, and teachers' general knowledge, GK, are both unobserved. 




Figure 7: Change in bias ($|\mathrm{Bias}(\hat{\delta}_s)| - |\mathrm{Bias}(\hat{\delta}_u)|$, X-stratified minus unstratified estimator) as a function of the treatment-confounder (Z-X) 
correlation for several effect sizes for the instructional practice example. When 'Change in Bias' is 
less than zero, it suggests that bias is reduced by stratifying on teachers' reading knowledge. 